DESCRIPTION OF THE INVENTION



Field of the Invention 



[001] The present invention relates to the field of neural networks. More 
particularly, the present invention, in various specific embodiments, involves 
methods and systems directed to providing a back-propagation neural network with 
enhanced neuron characteristics. 
Background of the Invention 

[002] Neural networks provide a practical approach to a wide range of 
problems. Specifically, they offer new and exciting possibilities in fields such as 
pattern recognition where traditional computational methods have not been 
successful. Standard computing methods rely on a linear approach to solve 
problems, while neural networks use a parallel approach similar to the workings of 
the brain. A neural network models the brain in a simple way by utilizing a number 
of simple units linked by weighted connections. The network is divided into layers 
that have different tasks. Within each layer there are individual units that are 
connected to units in layers above and below it. 

[003] All neural networks have input and output layers. The input layer 
contains dummy units which simply feed the input values into the network. Each unit 
has only one input or one output and the units perform no other function. The output 
layer units are processing units and perform calculations. They also contain the 
output that the network has produced. The individual processing units are
connected to units in the layers above and below them. The connections from the
input layer onward are weighted, and each weight represents the strength of its
connection.

[004] For processing units within hidden and output layers, each unit has a 
number of inputs (that are outputs from other units) and one output value. The 
function of the processing unit is to process its inputs and produce an output 
depending upon the value of the inputs. This is done by summing the inputs, each
multiplied by the weight of the connection on which that input arrived.

[005] In performing their processing, neural networks utilize a transfer 
function. The transfer function describes the rule that the processing unit uses to 
convert the activation input to an output. The function itself can be any function 
although it is more useful if it is continuous so that all possible activation inputs will 
have a corresponding output. The weights are the means of controlling the network so
that it can learn. Because the weights control the activation, they directly affect the
output of a processing unit. Adjusting the weights allows the network to learn and
recognize patterns.
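
The processing-unit computation described above can be sketched as follows. This is a minimal illustration only: the choice of a sigmoid as the transfer function, the function names, and the sample numbers are assumptions made for the example and are not taken from the specification.

```python
import math

def activation(inputs, weights):
    """Weighted sum of the inputs (the activation of the unit)."""
    return sum(x * w for x, w in zip(inputs, weights))

def transfer(s):
    """Continuous transfer function; a sigmoid is assumed here."""
    return 1.0 / (1.0 + math.exp(-s))

def unit_output(inputs, weights):
    """Output of one processing unit: transfer function applied to the weighted sum."""
    return transfer(activation(inputs, weights))

# Hypothetical values, for illustration only.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, 0.1]
print(activation(inputs, weights), unit_output(inputs, weights))
```

Because the transfer function is continuous, every possible activation value maps to an output, and changing any weight changes the activation and therefore the output.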

[006] For example, suppose that a single processing unit has a target output 
value. If the output of the processing unit is lower than the target, then the weight(s) 
can be increased until the activation is high enough for the output to be correct. 
Conversely, if the output is too high, then the weights can be reduced.
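
The adjustment described above can be sketched for a single processing unit as follows. The update rule used here (each weight moved in proportion to the output error and its input), the step size, and the sample values are assumptions chosen for illustration; the sketch is not the learning rule of the specification.

```python
import math

def unit_output(inputs, weights):
    """Sigmoid of the weighted sum of the inputs (sigmoid assumed)."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-s))

def adjust_weights(inputs, weights, target, step=0.5, iterations=200):
    """Nudge the weights until the unit output approaches the target."""
    weights = list(weights)
    for _ in range(iterations):
        error = target - unit_output(inputs, weights)
        # Output too low (error > 0): weights on positive inputs are increased.
        # Output too high (error < 0): the same weights are reduced.
        weights = [w + step * error * x for w, x in zip(weights, inputs)]
    return weights

# Hypothetical values, for illustration only.
inputs = [1.0, 0.5]
start = [0.0, 0.0]
target = 0.8
trained = adjust_weights(inputs, start, target)
print(unit_output(inputs, start), unit_output(inputs, trained))
```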

[007] Artificial neural networks using a back-propagation algorithm provide a
practical approach to a wide range of problems. Hardware implementation of such
networks may be necessary because of the requirements of many applications.
Back-propagation neural networks implemented in hardware can be trained in
several ways, including off-chip learning, chip-in-the-loop learning, and on-chip
learning. In off-chip learning, all computations are performed off the chip. Once
the solution weight state has been found, the weights are downloaded to the chip. In
chip-in-the-loop learning, the errors are calculated with the output of the chip,
but the weight updates are calculated and performed off the chip. In on-chip
learning, the weight updates are calculated and applied on the chip. Deciding
which of these three methods to apply is not always clear-cut in practice and may
depend not only on the application, but also on constraints set by the network
topology. On-chip learning is advantageous when the system requires the
following: 1) higher speed; 2) autonomous operation in an unknown and changing
environment; 3) small volume; and 4) reduced weight.

[008] One of the most important components of the neural network is the 
neuron, whose performance and complexity greatly affect the whole neural network. 
In prior art neural networks, the activation function of the neuron is the sigmoid. In 
the on-chip back-propagation learning method, both a non-linear function, such as 
the sigmoid, and its derivative are required. Increasingly, neural networks are 
required that utilize a simple neuron circuit that realizes both a neuron activation 
function and its derivative. Existing neural networks do not provide on-chip neuron 
circuits that realize both a neuron activation function and its derivative. In addition, 
existing neural networks do not provide a threshold and gain factor of a neuron that 
can be easily programmed according to different requirements. 
SUMMARY OF THE INVENTION

[009] In accordance with the current invention, a method and system for a
back-propagation neural network with enhanced neuron characteristics are provided
that avoid the problems associated with prior art neural networks as discussed herein
above.

[010] In one aspect, a neural network system includes a feedforward
network comprising at least one neuron circuit for producing an activation function
and a first derivative of the activation function, and a weight updating circuit for
providing updated weights to the feedforward network. The system also includes an
error back-propagation network for receiving the first derivative of the activation
function and for providing weight change data information to the weight updating circuit.

[011] In another aspect, a method for establishing a neural network includes
producing an activation function and a first derivative of the activation function 
utilizing at least one neuron circuit in a feedforward network. Next, the method 
includes providing updated weights to the feedforward network utilizing a weight 
updating circuit. Finally, the method includes receiving the first derivative of the 
activation function by an error back-propagation network and providing weight 
change data information to the weight updating circuit from the error back- 
propagation network. 

[012] Both the foregoing general description and the following detailed 
description are exemplary and are intended to provide further explanation of the 
invention as claimed. 
BRIEF DESCRIPTION OF THE DRAWINGS



[013] The accompanying drawings provide a further understanding of the 
invention and, together with the detailed description, explain the principles of the 
invention. In the drawings: 

[014] FIG. 1 is a functional block diagram of a back-propagation network 
structure 100 consistent with the present invention; 

[015] FIG. 2 is a functional block diagram of a neural network system 200 
consistent with the present invention; 

[016] FIG. 3 is a functional block diagram of a weight unit 220 used in 
conjunction with the neural network system 200 of FIG. 2 consistent with the present 
invention; 

[017] FIG. 4 is a functional block diagram of a neuron circuit 400 used in 
conjunction with the neural network system 200 of FIG. 2 consistent with the present 
invention; 

[018] FIG. 5 is a graphical representation of the results of a computer 
simulation of the neuron circuit 400 of FIG. 4 consistent with the invention;

[019] FIG. 6A is a graphical representation of the results of a computer 
simulation of the simulated activation function from the first differential circuit output 
465 of neuron circuit 400 of FIG. 4 with different thresholds; 

[020] FIG. 6B is a graphical representation of the results of a computer 
simulation of the simulated activation function from the first differential circuit output 
465 of neuron circuit 400 of FIG. 4 with various gain factors;

[021] FIG. 7A is a graphical representation of the results of a computer
simulation of the transient output of the training of the neural network system 200 of 
FIG. 2; and 

[022] FIG. 7B is a graphical representation of the results of a computer 
simulation of the sin(x) function approximation.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

[023] Reference will now be made to various embodiments according to this 
invention, examples of which are shown in the accompanying drawings and will be 
obvious from the description of the invention. In the drawings, the same reference 
numbers represent the same or similar elements in the different drawings whenever 
possible. 

[024] FIG. 1 shows a back-propagation network structure 100 consistent with
the invention. Back-propagation network structure 100 comprises input layer 105, 
hidden layer 110, and an output layer 115. Using switches, a re-configurable back- 
propagation network can be formed in which both the number of layers and the 
number of neurons within each layer can be adjusted. The transfer function of each 
neuron is the sigmoid function expressed by equation (1),

[025] $$f(s) = \frac{1}{1 + e^{-\alpha s}} \qquad (1)$$

[026] where $\alpha$ is the gain factor and $s$ is the sum of the weighted inputs.
With $R$ as the number of the training set elements, $w_{i,j}^{l}$ is the weight between the $i$th
($0 \le i \le n$) neuron of the $(l-1)$th layer and the $j$th neuron of the $l$th ($l = 1, 2, \ldots, L$) layer,
and $\theta_{j}^{l}$ is the threshold of the $j$th neuron of the $l$th layer. For convenience, let $\theta_{j}^{l} = w_{n,j}^{l}$,
$x_{n}^{l-1} = 1$. For a certain training sample $r$ ($r = 1, 2, \ldots, R$), $x_{i,r}^{l-1}$ is the output of the $i$th
neuron of the $(l-1)$th layer; $x_{j,r}^{l}$ is the output of the $j$th neuron of the $l$th layer; $t_{j,r}$ is
the target value when $l = L$; and $s_{j,r}^{l}$ is the weighted sum from the neurons of the $(l-1)$th
layer to the $j$th neuron of the $l$th layer. The feedforward calculation can be expressed
as follows,



[027] $$x_{j,r}^{l}(k) = f\left(s_{j,r}^{l}(k)\right) = f\left(\sum_{i=0}^{n} w_{i,j}^{l}(k)\, x_{i,r}^{l-1}(k)\right) \qquad (2)$$



[028] To describe the error back-propagation process, several definitions 
should be made first. The neuron error is defined as, 



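[029] $$\varepsilon_{j,r}^{l}(k) = \begin{cases} t_{j,r} - x_{j,r}^{l}(k), & l = L \\ \sum_{m} \delta_{m,r}^{l+1}(k)\, w_{j,m}^{l+1}(k), & 1 \le l < L \end{cases} \qquad (3)$$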



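[030] where the weight error is defined as,

[031] $$\delta_{j,r}^{l}(k) = f'\left(s_{j,r}^{l}(k)\right) \varepsilon_{j,r}^{l}(k) \qquad (4)$$

[032] The weight updating rule can be expressed as equation (5),

[033] $$w_{i,j}^{l}(k+1) = w_{i,j}^{l}(k) + \eta \sum_{r=1}^{R} \delta_{j,r}^{l}(k)\, x_{i,r}^{l-1}(k) \qquad (5)$$

[034] where $\eta$ is the learning rate and $\Delta w_{i,j}^{l}(k+1) = \eta \sum_{r=1}^{R} \delta_{j,r}^{l}(k)\, x_{i,r}^{l-1}(k)$ is the weight
change.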
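
The feedforward calculation, the error back-propagation, and the weight updating rule of equations (2) through (5) can be sketched in software as follows. This is an illustrative sketch only: the 2-2-1 layer sizes, the training set, the gain factor, the learning rate, the number of iterations, and all function names are assumptions, and the threshold is modelled as the weight of a constant input of 1 placed last in each layer.

```python
import math
import random

ALPHA = 1.0  # gain factor of the sigmoid in equation (1); value assumed

def f(s):
    return 1.0 / (1.0 + math.exp(-ALPHA * s))

def f_prime(s):
    y = f(s)
    return ALPHA * y * (1.0 - y)

def forward(weights, x):
    """Equation (2): returns the layer outputs xs and the weighted sums ss."""
    xs, ss = [list(x) + [1.0]], []      # the constant 1 feeds the threshold weight
    for w in weights:                   # w[i][j]: neuron i of layer l-1 to neuron j of layer l
        s = [sum(w[i][j] * xs[-1][i] for i in range(len(w)))
             for j in range(len(w[0]))]
        ss.append(s)
        xs.append([f(v) for v in s] + [1.0])
    return xs, ss

def weight_changes(weights, samples, eta):
    """Equations (3)-(5): weight change summed over all training samples."""
    delta_w = [[[0.0] * len(w[0]) for _ in w] for w in weights]
    for x, t in samples:
        xs, ss = forward(weights, x)
        eps = [tj - xj for tj, xj in zip(t, xs[-1])]       # equation (3), l = L
        for l in reversed(range(len(weights))):
            delta = [f_prime(s) * e for s, e in zip(ss[l], eps)]   # equation (4)
            for i in range(len(weights[l])):
                for j in range(len(delta)):
                    delta_w[l][i][j] += eta * delta[j] * xs[l][i]
            # Equation (3), l < L: neuron error of the layer below (threshold row skipped).
            eps = [sum(delta[j] * weights[l][i][j] for j in range(len(delta)))
                   for i in range(len(weights[l]) - 1)]
    return delta_w

def train(weights, samples, eta=0.5, iterations=2000):
    for _ in range(iterations):
        dw = weight_changes(weights, samples, eta)
        for l, w in enumerate(weights):
            for i, row in enumerate(w):
                for j in range(len(row)):
                    row[j] += dw[l][i][j]                  # equation (5)
    return weights

def total_error(weights, samples):
    return sum((tj - xj) ** 2 for x, t in samples
               for tj, xj in zip(t, forward(weights, x)[0][-1]))

random.seed(0)
sizes = [2, 2, 1]   # hypothetical 2-2-1 network
weights = [[[random.uniform(-1, 1) for _ in range(sizes[k + 1])]
            for _ in range(sizes[k] + 1)] for k in range(len(sizes) - 1)]
samples = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
before = total_error(weights, samples)
train(weights, samples)
after = total_error(weights, samples)
print(before, after)
```

The weight change computed by `weight_changes` is the negative gradient of half the summed squared error scaled by the learning rate, which is why repeated application of equation (5) reduces the error on the training set.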

[035] FIG. 2 shows a neural network system 200 consistent with the present 
invention that may be designed according to the back-propagation network structure 
100 as discussed with respect to FIG. 1. Neural network system 200 comprises a
feedforward network 205, a weight updating circuit 210, and an error
back-propagation network 215. In feedforward network 205, the synapse is realized by
the Gilbert multiplier, which is simple and area-efficient. A nonlinear I-V transfer
function is accomplished by a neuron. Using the forward difference method, the
neuron generates a sigmoidal function and its derivative. The derivative is used in 
the error back-propagation network 215 that also includes multipliers. A weight unit 
220 implements the weight update operations as shown in FIG. 3. Weight unit 220 
comprises an analog to digital converter 305, which may comprise a 7-bit analog to 
digital converter. Analog to digital converter 305 is used to convert the analog 
weight change signal into digital form, which may then be added by an adder 310 to 
a 12-bit weight. The new weight from the output of adder 310 is converted to an 
analog signal by a digital to analog converter 315 for the next feedforward
calculation. The new weight is stored in a random access memory 320 for the next 
weight updating. 
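
The data path of weight unit 220 can be sketched behaviourally as follows. Only the 7-bit conversion, the 12-bit weight, the adder, the digital to analog conversion, and the memory are taken from the description above. The signed code ranges, the full-scale voltage, the saturating addition, the alignment of the 7-bit code with the least significant bits of the 12-bit weight, and all names are assumptions made for illustration.

```python
def adc_7bit(delta_v, full_scale=1.0):
    """Quantize an analog weight-change signal to a signed 7-bit code (-64..63)."""
    code = round(delta_v / full_scale * 64)
    return max(-64, min(63, code))

def add_12bit(weight_code, delta_code):
    """Add the weight change to the 12-bit weight, saturating at the code limits."""
    return max(-2048, min(2047, weight_code + delta_code))

def dac_12bit(weight_code, full_scale=1.0):
    """Convert the 12-bit weight code back to an analog value."""
    return weight_code / 2048 * full_scale

class WeightUnit:
    """Behavioural model: ADC, adder, RAM, and DAC in sequence."""

    def __init__(self, n_weights):
        self.ram = [0] * n_weights      # stored 12-bit weight codes

    def update(self, index, delta_v):
        new_code = add_12bit(self.ram[index], adc_7bit(delta_v))
        self.ram[index] = new_code      # kept for the next weight updating
        return dac_12bit(new_code)      # analog weight for the next feedforward pass

# Hypothetical usage, for illustration only.
unit = WeightUnit(4)
analog_weight = unit.update(2, 0.25)
print(unit.ram, analog_weight)
```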

[036] FIG. 4 shows a neuron circuit 400 consistent with the invention 
comprising a linear resistor circuit 405, a first differential circuit 410, and a second 
differential circuit 415. 

[037] Linear resistor circuit 405, having a resistance value $R_{AB}$, comprises a
first linear resistor circuit transistor 420 including a first linear resistor circuit
transistor gate voltage 425 ($V_N$), a second linear resistor circuit transistor 430
including a second linear resistor circuit transistor gate voltage 435 ($V_P$), a linear
resistor circuit output 440, and a first reference voltage 445 ($V_{ref1}$). First reference
voltage 445 ($V_{ref1}$) is chosen so that both first linear resistor circuit transistor 420 and
second linear resistor circuit transistor 430 work in their linear range.
Linear resistor circuit 405 can be controlled by first linear resistor circuit transistor
gate voltage 425 ($V_N$) and second linear resistor circuit transistor gate voltage 435
($V_P$).

[038] First differential circuit 410 comprises a first differential transistor pair
450, a first differential transistor pair first port 455, a first differential transistor pair
second port 460, a first differential circuit output 465 ($V_{out1}$), and a second reference
voltage 490 ($V_{ref2}$). Second differential circuit 415 comprises a second differential
transistor pair 470, a second differential transistor pair first port 475, a second
differential transistor pair second port 480, a second differential circuit output 485,
and a third reference voltage 495.

[039] Both first differential transistor pair 450 and second differential 
transistor pair 470 may comprise simple differential transistor pairs comprising 
identical transistors. First differential transistor pair first port 455 and second 
differential transistor pair first port 475 are electrically connected to linear resistor 
circuit output 440. First differential transistor pair second port 460 is supplied with 
second reference voltage 490 ($V_{ref2}$) that may comprise a fixed voltage. Similarly,
second differential transistor pair second port 480 is supplied with third reference
voltage 495 that may comprise $V_{ref2} - \Delta V$, where $\Delta V$ is a fixed small voltage. $I_{ref1}$ and
$I_{ref2}$ are fixed current sources and $V_{dd}$ may be supplied with a 3.3 V voltage source.

[040] With its respective active load, first differential circuit 410 realizes a
sigmoidal shaped activation function at first differential circuit output 465 ($V_{out1}$).
Similarly, with its respective active load, second differential circuit 415 realizes a
signal at second differential circuit output 485 ($V_{out2}$). When the signal at second
differential circuit output 485 ($V_{out2}$) is subtracted from first differential circuit output
465 ($V_{out1}$), the approximate derivative of the sigmoidal shaped activation function at
first differential circuit output 465 ($V_{out1}$) is realized.

[041] Assuming that the transistors of first differential transistor pair 450 are 
operating in saturation and follow an ideal square law, the drain current of the transistor
connected directly to linear resistor circuit output 440 can be expressed as



[042] " " ' ' " (6) 

[043] with the input differential voltage $V_d$ ($V_d = V_B - V_{ref2}$) in a finite region of

[044] $$|V_d| \le \sqrt{\frac{2 I_{ref1}}{\beta}} \qquad (7)$$



[045] Here $\beta$ is the transconductance parameter for the transistors of first
differential transistor pair 450. 
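
The behaviour of an ideal square-law differential pair inside and outside the finite region of equation (7) can be sketched numerically as follows. The expression used is the commonly cited textbook result for a source-coupled pair whose transistors obey $I = (\beta/2)(V_{GS} - V_T)^2$; it is an assumption made for this illustration and is not quoted from the specification. The polarity of the pair, the component values, and the function name are likewise assumed.

```python
import math

def drain_current(v_d, i_ref, beta):
    """Drain current of the transistor whose gate sees +v_d (textbook square-law model)."""
    v_sat = math.sqrt(2.0 * i_ref / beta)    # boundary of the finite region, equation (7)
    if v_d >= v_sat:
        return i_ref                          # all of the reference current flows here
    if v_d <= -v_sat:
        return 0.0                            # this transistor is cut off
    return i_ref / 2.0 + (beta / 4.0) * v_d * math.sqrt(4.0 * i_ref / beta - v_d ** 2)

# Hypothetical values, for illustration only.
I_REF = 10e-6    # reference current in amperes
BETA = 100e-6    # transconductance parameter in A/V^2
V_SAT = math.sqrt(2.0 * I_REF / BETA)
for v in (-0.6, -0.2, 0.0, 0.2, 0.6):
    print(v, drain_current(v, I_REF, BETA))
```

Inside the region the current varies smoothly with the differential voltage, and at the boundary it meets the saturated value, which is what produces the two saturation levels of the activation function.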



[046] When $I_{in}$ is small, $V_d > \sqrt{2 I_{ref1}/\beta}$ and $V_{out1}$ remains at the low saturation
voltage. As $I_{in}$ increases, $V_B$ decreases slowly and $V_{out1}$ increases slowly. When
$V_d < -\sqrt{2 I_{ref1}/\beta}$, $V_{out1}$ reaches and remains at the high saturation level.

[047] Assuming that $V_{out} = V_{out}(I_{in})$ is the generated neuron activation
function, using the forward difference method, the approximate derivative voltage
$V_{deriv}$ is achieved by subtracting $V_{out2}$ from $V_{out1}$ as follows

[048] $$V'_{out}(I_{in}) = V'_{out}(V_d)\, V'_d(I_{in}) \approx -\frac{V_{out}(V_B - V_{ref2} + \Delta V) - V_{out}(V_B - V_{ref2})}{\Delta V}\, R_{AB} \qquad (8)$$

[049] $$V_{deriv} \approx \frac{\Delta V}{R_{AB}}\, V'_{out}(I_{in}) \approx -\left(V_{out}(V_B - (V_{ref2} - \Delta V)) - V_{out}(V_B - V_{ref2})\right) = V_{out1} - V_{out2} \qquad (9)$$
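
The forward difference of equation (9) can be checked numerically with the following sketch. The function `v_out` is a hypothetical stand-in for the differential circuit, modelled as an ideal sigmoid that falls from the high to the low saturation level as the differential voltage rises; its gain and the voltages used in the example are assumptions, and only the two saturation levels are borrowed from the simulation discussed later in the description.

```python
import math

def v_out(v_d, gain=10.0, v_low=0.52, v_high=2.59):
    """Assumed circuit model: output falls from v_high to v_low as v_d rises."""
    return v_low + (v_high - v_low) / (1.0 + math.exp(gain * v_d))

def v_deriv(v_b, v_ref2, delta_v):
    """Equation (9): difference of the two differential circuit outputs."""
    v_out1 = v_out(v_b - v_ref2)                # first circuit, reference v_ref2
    v_out2 = v_out(v_b - (v_ref2 - delta_v))    # second circuit, reference lowered by delta_v
    return v_out1 - v_out2

def exact(v_d, delta_v, gain=10.0, v_low=0.52, v_high=2.59):
    """Analytic value of -delta_v * dV_out/dV_d for the assumed model."""
    u = 1.0 / (1.0 + math.exp(gain * v_d))
    return delta_v * (v_high - v_low) * gain * u * (1.0 - u)

# Hypothetical operating point, for illustration only.
approx = v_deriv(v_b=1.5, v_ref2=1.5, delta_v=0.01)
print(approx, exact(0.0, 0.01))
```

The smaller the fixed voltage step is chosen, the closer the difference of the two outputs follows the true derivative, at the cost of a smaller output signal.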

[050] FIG. 5 shows the results of a computer simulation of neuron circuit 400 
of FIG. 4 consistent with the invention. This simulation was performed using 
HSPICE circuit simulation software marketed by Avanti Corporation of 46871 
Bayside Parkway, Fremont, CA, 94538. The computer simulation of neuron circuit 
400 was performed using level 47 transistor models for a standard 1.2 µm CMOS
process. The dash-dot line and the dashed line of FIG. 5 show the simulated activation
function from first differential circuit output 465 of neuron circuit 400 and its fitted 
sigmoid function respectively. From this simulation, the error between the fitted 
sigmoid function and first differential circuit output 465 is less than 3%. The solid 
line and the dotted line of FIG. 5 show the derivative found by the simulation of neuron
circuit 400 and the first derivative of a fitted sigmoid function respectively. From this 
simulation, the relative error between the first derivative of a fitted sigmoid function 
and the first differential circuit output 465 minus the second differential circuit output 
485 is less than 5%. 

[051] One advantage of neural network system 200 of FIG. 2 is derived from 
its ability to adapt to an unknown and changing environment. Therefore, good 
programmability is of fundamental importance. Specifically, different applications of 
neural network system 200 may need different gain factors $\alpha$ and threshold vector $\theta$.
Different gain factors $\alpha$ and threshold vector $\theta$ can be realized by varying $I_{ref1}$, first
linear resistor circuit transistor gate voltage 425 ($V_N$), and second linear resistor
circuit transistor gate voltage 435 ($V_P$).

[052] The threshold vector $\theta$ can be adjusted by changing the reference
current $I_{ref1}$. When $I_{ref1}$ increases, the current $I_{in}$ needed to satisfy
$V_B - V_{ref2} > \sqrt{2 I_{ref1}/\beta}$ decreases, so the activation curve shifts to the left. Otherwise,
the curve shifts to the right.

[053] The gain factor $\alpha$ can be varied by changing first linear resistor circuit
transistor gate voltage 425 ($V_N$) and second linear resistor circuit transistor gate
voltage 435 ($V_P$). When both first linear resistor circuit transistor 420 and second
linear resistor circuit transistor 430 are working in their linear range and their sizes
are chosen in such a way that $\beta_1 = \beta_2$, the equivalent linear resistor value $R_{AB}$ is
written as

[054] $$R_{AB} = \frac{1}{\beta\left[(V_N - V_P) - (V_{T1} + |V_{T2}|)\right]} \qquad (10)$$

[055] Equation (10) shows that the larger ($V_N - V_P$) is, the smaller $R_{AB}$ is, and
therefore the smaller the slope of $V_B$ versus $I_{in}$ is. As a result, $V_{out1}$ increases
more slowly and the gain factor is smaller.
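
The dependence of the equivalent resistance on the gate voltages in equation (10) can be illustrated with the short sketch below. The transconductance parameter, the threshold voltages, the gate voltages, and the function name are hypothetical values chosen only to show the trend.

```python
def r_ab(v_n, v_p, beta, v_t1, v_t2):
    """Equation (10): equivalent resistance of the linear resistor circuit."""
    overdrive = (v_n - v_p) - (v_t1 + abs(v_t2))
    if overdrive <= 0:
        raise ValueError("gate voltages too close: equation (10) does not apply")
    return 1.0 / (beta * overdrive)

# Hypothetical device parameters, for illustration only.
BETA = 50e-6            # A/V^2
V_T1, V_T2 = 0.7, -0.8  # threshold voltages in volts

r_small_gap = r_ab(3.0, 0.5, BETA, V_T1, V_T2)   # V_N - V_P = 2.5 V
r_large_gap = r_ab(3.3, 0.0, BETA, V_T1, V_T2)   # V_N - V_P = 3.3 V
print(r_small_gap, r_large_gap)
```

With these assumed values, widening the gap between the two gate voltages lowers the resistance, so the same input current produces a smaller change in $V_B$ and the activation curve becomes less steep.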

[056] FIG. 6A shows the simulated activation function from first differential 
circuit output 465 of neuron circuit 400 with different thresholds. Different simulated 
activation functions from first differential circuit output 465 of neuron circuit 400 with 
various gain factors are shown in FIG. 6B. One advantage of neuron circuit 400 is 
that the saturation levels of the activation function from first differential circuit output
465 remain constant for different gain values. This ensures that for different gain
values, the input linear range of the synapse in a subsequent layer is completely used.

[057] Two additional experimental HSPICE simulations are shown in FIGS. 7A
and 7B to illustrate the operation of neural network system 200. The first experiment
involves the non-linear partition problem, while the second involves the sin(x)
function approximation.

[058] FIG. 7A shows the transient output of the training of neural network 
system 200 configured as a 2-1 MLP. Given that the low output voltage of the
neuron is 0.52V, the high output voltage is 2.59V and the middle voltage is 1.56V,
the simulation can be described as follows: if the two inputs are both lower than
1.56V or both greater than 1.56V, the target output is 0.52V; otherwise the target output is 2.59V.
The corresponding inputs of lines A, B, C and D in FIG. 7A are (1V, 1V), (1V, 2V),
(2V, 1V), and (2V, 2V) respectively, and the corresponding targets are 0.52V, 2.59V,
2.59V and 0.52V respectively. It can be seen from FIG. 7A that the neural network 
system 200 reaches convergence within 1 ms. 

[059] The second experiment is the sin(x) function approximation, wherein a 
1-5-1 configuration is used. These results are shown in FIG. 7B, in which the
training set elements are shown as well as outputs of neural network system 200. It
can be seen from FIG. 7B that neural network system 200 can approximate the
sin(x) function accurately.

[060] It will be appreciated that a system in accordance with the invention 
can be constructed in whole or in part from special purpose hardware residing on 
one or a plurality of chips, a general purpose computer system, or any combination 
thereof, any portion of which may be controlled by a suitable program. Any program
may in whole or in part comprise part of or be stored on the system in a conventional
manner, or it may in whole or in part be provided to the system over a network or
other mechanism for transferring information in a conventional manner. In addition, 
it will be appreciated that the system may be operated and/or otherwise controlled 
by means of information provided by an operator using operator input elements (not 
shown) which may be connected directly to the system or which may transfer the 
information to the system over a network or other mechanism for transferring 
information in a conventional manner. 

[061] Other embodiments of the invention will be apparent to those skilled in 
the art from consideration of the specification and practice of the invention disclosed 
herein. It is intended that the specification and examples be considered as 
exemplary only, with a true scope and spirit of the invention being indicated by the 
following claims. 